METHOD FOR PROCESSING 3-D AUDIO SCENE WITH EXTENDED
SPATIALITY OF SOUND SOURCE

Description 

Technical Field

The present invention relates to a method for
processing a three-dimensional audio scene having a sound
source whose spatiality is extended; and, more
particularly, to a method for processing a
three-dimensional audio scene to extend the spatiality of
a sound source in the three-dimensional audio scene.

Background Art 

Generally, a content providing server encodes contents
using a predetermined encoding method and transmits the
encoded contents to content consuming terminals that
consume the contents. The content consuming terminals
decode the contents using a predetermined decoding method
and output the transmitted contents.

Accordingly, the content providing server includes an
encoding unit for encoding the contents and a transmission
unit for transmitting the encoded contents. On the other
hand, each content consuming terminal includes a reception
unit for receiving the transmitted encoded contents, a
decoding unit for decoding the encoded contents, and an
output unit for outputting the decoded contents to users.

Many encoding/decoding methods for audio/video signals
are known so far. Among them, an encoding/decoding method
based on Moving Picture Experts Group 4 (MPEG-4) is widely
used these days. MPEG-4 is a technical standard for data
compression and restoration technology defined by the MPEG
to transmit moving pictures at a low transmission rate.

According to MPEG-4, an object of an arbitrary shape
can be encoded and the content consuming terminals consume
a scene composed of a plurality of objects. Therefore,
MPEG-4 defines Audio Binary Format for Scene (Audio BIFS)
with a scene description language for designating a sound
object expression method and the characteristics thereof.

Meanwhile, along with the development in video, users
want to consume contents with more lifelike sound and video
quality. In the MPEG-4 Audio Binary Format for Scene
(Audio BIFS), an AudioFX node and a DirectiveSound node are
used to express the spatiality of a three-dimensional audio
scene. In these nodes, the modeling of a sound source
usually depends on a point source. A point source can be
described and embodied in a three-dimensional sound space
easily.

Actual sound sources, however, tend to have more than
two dimensions rather than being a point in the literal
sense. More importantly, the shape of the sound source can
be recognized by human beings, which is disclosed by
J. Blauert, "Spatial Hearing," The MIT Press, Cambridge,
Mass., 1996.

For example, the sound of waves dashing against a
coastline stretched in a straight line can be recognized as
a linear sound source instead of a point sound source. To
improve the sense of reality of the three-dimensional
audio scene by using the Audio BIFS, the size and shape of
the sound source should be expressed. Otherwise, the sense
of reality of a sound object in the three-dimensional
audio scene would be damaged seriously.

That is, the spatiality of a sound source needs to be
described in order to endow a three-dimensional audio scene
with a sound source having more than one dimension.

Disclosure of Invention 
It is, therefore, an object of the present invention
to provide a method for generating and consuming a
three-dimensional audio scene having a sound source whose
spatiality is extended by adding sound source
characteristics information, which has information on
extending the spatiality of the sound source, to
three-dimensional audio scene description information.

The other objects and advantages of the present
invention can be easily recognized by those of ordinary
skill in the art from the drawings, detailed description
and claims of the present specification.

In accordance with one aspect of the present
invention, there is provided a method for generating a
three-dimensional audio scene with a sound source whose
spatiality is extended, including the steps of: a)
generating a sound object; and b) generating
three-dimensional audio scene description information
including sound source characteristics information for the
sound object, wherein the sound source characteristics
information includes spatiality extension information of
the sound source which is information on the size and shape
of the sound source expressed in a three-dimensional space.

In accordance with another aspect of the present
invention, there is provided a method for consuming a
three-dimensional audio scene with a sound source whose
spatiality is extended, including the steps of: a)
receiving a sound object and three-dimensional audio scene
description information including sound source
characteristics information for the sound object; and b)
outputting the sound object based on the three-dimensional
audio scene description information, wherein the sound
source characteristics information includes spatiality
extension information which is information on the size and
shape of a sound source expressed in a three-dimensional
space.

Brief Description of Drawings

The above and other objects and features of the
present invention will become apparent from the following
description of the preferred embodiments given in
conjunction with the accompanying drawings, in which:

Fig. 1 is a diagram illustrating various shapes of
sound sources; 

Fig. 2 is a diagram describing a method for expressing
a spatial sound source by grouping successive point sound
sources;

Fig. 3 shows an example where spatiality extension
information is added to a "DirectiveSound" node of Audio
BIFS in accordance with the present invention;

Fig. 4 is a diagram illustrating how a sound source is
extended in accordance with the present invention; and 

Fig. 5 is a diagram depicting the distributions of
point sound sources based on the shapes of various sound
sources in accordance with the present invention.

Best Mode for Carrying Out the Invention

Other objects and aspects of the invention will become
apparent from the following description of the embodiments
with reference to the accompanying drawings, which are set
forth hereinafter.

The following description exemplifies only the principles
of the present invention. Even if they are not described
or illustrated clearly in the present specification, one of
ordinary skill in the art can embody the principles of the
present invention and invent various apparatuses within the
concept and scope of the present invention.

The conditional terms and embodiments presented in the
present specification are intended only to make the concept
of the present invention understood, and the invention is
not limited to the embodiments and conditions mentioned in
the specification.

In addition, all the detailed description of the
principles, viewpoints and embodiments, including
particular embodiments, of the present invention should be
understood to include structural and functional equivalents
to them. The equivalents include not only currently known
equivalents but also those to be developed in the future,
that is, all devices invented to perform the same function,
regardless of their structures.

For example, block diagrams of the present invention
should be understood to show a conceptual viewpoint of an
exemplary circuit that embodies the principles of the
present invention. Similarly, all the flowcharts, state
transition diagrams, pseudo codes and the like can be
expressed substantially in a computer-readable medium, and
whether or not a computer or a processor is described
distinctively, they should be understood to express various
processes operated by a computer or a processor.

Functions of various devices illustrated in the
drawings, including a functional block expressed as a
processor or a similar concept, can be provided not only by
using hardware dedicated to the functions, but also by
using hardware capable of running proper software for the
functions. When a function is provided by a processor, the
function may be provided by a single dedicated processor, a
single shared processor, or a plurality of individual
processors, part of which can be shared.

The explicit use of a term such as 'processor',
'control' or a similar concept should not be understood to
refer exclusively to a piece of hardware capable of running
software, but should be understood to implicitly include a
digital signal processor (DSP), hardware, and ROM, RAM and
non-volatile memory for storing software. Other known
and commonly used hardware may be included therein, too.

In the claims of the present specification, an element
expressed as a means for performing a function described in
the detailed description is intended to include all methods
for performing the function, including all formats of
software, such as combinations of circuits for performing
the intended function, firmware/microcode and the like. To
perform the intended function, the element cooperates
with a proper circuit for executing the software. The
present invention defined by the claims includes diverse
means for performing particular functions, and the means
are connected with each other in the manner required by the
claims. Therefore, any means that can provide the function
should be understood to be an equivalent to what is
understood from the present specification.

Other objects and aspects of the invention will become
apparent from the following description of the embodiments
with reference to the accompanying drawings, which are set
forth hereinafter. The same reference numeral is given to
the same element, although the element appears in different
drawings. In addition, if a further detailed description of
the related prior art is determined to blur the point of
the present invention, the description is omitted.
Hereafter, preferred embodiments of the present invention
will be described in detail.

Fig. 1 is a diagram illustrating various shapes and
sizes of sound sources. Referring to Fig. 1, a sound
source can be a point, a line, a surface, or a space having
a volume. Since a sound source has an arbitrary shape and
size, it is very complicated to describe the sound source.
However, if the shape of the sound source to be modeled is
controlled, the sound source can be described in a less
complicated manner.

In the present invention, it is assumed that point
sound sources are distributed uniformly in the dimension of
a virtual sound source in order to model sound sources of
various shapes and sizes. As a result, the sound sources
of various shapes and sizes can be expressed as continuous
arrays of point sound sources. Here, the location of each
point sound source in a virtual object can be calculated
using a vector location of a sound source which is defined
in a three-dimensional scene.

When a spatial sound source is modeled with a
plurality of point sound sources, the spatial sound source
should be described using a node defined in Audio BIFS.
When the node defined in Audio BIFS, which will be referred
to as an AudioFX node, is used, any effect can be included
in the three-dimensional scene. Therefore, an effect
corresponding to the spatial sound source can be programmed
through the AudioFX node and inserted into the
three-dimensional scene.

However, this requires a very complicated Digital Signal
Processing (DSP) algorithm, and it is very troublesome to
control the dimension of the spatial sound source.

Also, the point sound sources distributed in a limited
dimension of an object are grouped using the Audio BIFS,
and the spatial location and direction of the sound sources
can be changed by changing the sound source group. First
of all, the characteristics of the point sound sources are
described using a plurality of "DirectiveSound" nodes. The
locations of the point sound sources are calculated to be
distributed uniformly on the surface of the object.

Subsequently, the point sound sources are located with
a spatial distance that can eliminate spatial aliasing,
which is disclosed by A. J. Berkhout, D. de Vries, and P.
Vogel, "Acoustic control by wave field synthesis," J.
Acoust. Soc. Am., Vol. 93, No. 5, pages 2764 to 2778,
May 1993. The spatial sound source can be vectorized by
using a group node and grouping the point sound sources.
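
As a rough illustration of the spacing constraint, the sketch
below uses a commonly quoted worst-case rule from the wave field
synthesis literature: sources spaced d apart are free of spatial
aliasing up to about f = c / (2d). This rule, the assumed speed
of sound, and the function names are illustrative assumptions
and are not taken from the present specification or from Audio
BIFS.

```python
# Assumed speed of sound in air at about 20 degrees C (illustrative).
SPEED_OF_SOUND_M_S = 343.0


def max_spacing(f_max_hz, c=SPEED_OF_SOUND_M_S):
    """Largest spacing (m) that avoids aliasing up to f_max_hz.

    Uses the worst-case rule d <= c / (2 * f_max), an assumption
    made for this sketch.
    """
    if f_max_hz <= 0:
        raise ValueError("f_max_hz must be positive")
    return c / (2.0 * f_max_hz)


def aliasing_frequency(spacing_m, c=SPEED_OF_SOUND_M_S):
    """Frequency (Hz) above which aliasing may occur for a spacing."""
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    return c / (2.0 * spacing_m)
```

Under these assumptions, reproducing content up to 1 kHz without
aliasing would call for a spacing of roughly 17 cm or less.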

Fig. 2 is an illustrative diagram depicting a scene of
Audio BIFS. In the drawing, a virtual successive linear
sound source is modeled by using three point sound sources
which are distributed uniformly along the axis of the
linear sound source.

The locations of the point sound sources are
determined to be (x0-dx, y0-dy, z0-dz), (x0, y0, z0), and
(x0+dx, y0+dy, z0+dz) according to the concept of the
virtual sound source. Here, dx, dy and dz can be
calculated from a vector between a listener and the
location of the sound source and the angle between the
direction vectors of the sound source, the vector and the
angle being defined in an angle field and a direction
field.
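
The three locations can be sketched as follows. The function name
is hypothetical, and the offset (dx, dy, dz) is taken as an input
because the text does not give a closed-form rule for deriving it
from the listener vector and the angle.

```python
def linear_source_points(center, offset):
    """Return three point-source locations along a linear source.

    center: (x0, y0, z0), the middle point source.
    offset: (dx, dy, dz), the step along the source axis; supplied by
            the caller, since its derivation is not specified here.
    """
    x0, y0, z0 = center
    dx, dy, dz = offset
    return [
        (x0 - dx, y0 - dy, z0 - dz),
        (x0, y0, z0),
        (x0 + dx, y0 + dy, z0 + dz),
    ]
```

For example, a center of (1.0, 2.0, 3.0) with an offset of
(0.5, 0.0, 0.25) places the outer sources at (0.5, 2.0, 2.75) and
(1.5, 2.0, 3.25).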

Fig. 2 describes a spatial sound source by using a
plurality of point sound sources. It appears that Audio
BIFS can support the description of a particular scene.
However, this method requires too many unnecessary sound
object definitions. This is because many objects should be
defined to model one single object.

Considering that the genuine object of the hybrid
description of Moving Picture Experts Group 4 (MPEG-4) is a
more object-oriented representation, it is desirable to
combine the point sound sources, which are used to model
one spatial sound source, and reproduce one single object.

In accordance with the present invention, a new field
is added to a "DirectiveSound" node of the Audio BIFS to
describe the shape and size attributes of a sound source.
Fig. 3 shows an example where spatiality extension
information is added to a "DirectiveSound" node of Audio
BIFS in accordance with the present invention.

Referring to Fig. 3, a new rendering design
corresponding to the value of a "SourceDimensions" field is
applied to the "DirectiveSound" node. The
"SourceDimensions" field also includes shape information of
the sound source. If the value of the "SourceDimensions"
field is "0,0,0", the sound source becomes one point, and no
additional technology for extending the sound source is
applied to the "DirectiveSound" node. If the value of the
"SourceDimensions" field is a value other than "0,0,0", the
dimension of the sound source is extended virtually.
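
A minimal sketch of how a renderer might interpret the
"SourceDimensions" value is given below. The function name and
the shape labels are hypothetical; the mapping from the number of
non-zero components to a point, line, surface or volume follows
the examples given in this specification.

```python
def classify_source(source_dimensions):
    """Map a SourceDimensions triple to a shape label.

    "0,0,0" means a plain point source with no extension; otherwise
    the count of non-zero components selects line, surface or volume.
    """
    if len(source_dimensions) != 3:
        raise ValueError("SourceDimensions must have three components")
    nonzero = sum(1 for d in source_dimensions if d != 0)
    return ("point", "line", "surface", "volume")[nonzero]
```

With this sketch, (0, 0, 2.0) is treated as a line source and
(0, 1.0, 2.0) as a surface source.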

The location and direction of the sound source are
defined in a location field and a direction field,
respectively, in the "DirectiveSound" node. The dimension
of the sound source is extended in a direction
perpendicular to a vector defined in the direction field,
based on the value of the "SourceDimensions" field.

The "location" field defines the geometrical center of
the extended sound source, whereas the "SourceDimensions"
field defines the three-dimensional size of the sound
source. In short, the size of the sound source extended
spatially is determined according to the values of Δx, Δy
and Δz.

Fig. 4 is a diagram illustrating how a sound source is
extended in accordance with the present invention. As
illustrated in the drawing, the value of the
"SourceDimensions" field is (0, Δy, Δz), Δy and Δz being
not zero (Δy ≠ 0, Δz ≠ 0). This indicates a surface sound
source having an area of Δy × Δz.

The illustrated sound source is extended in a
direction perpendicular to a vector defined in the
"direction" field based on the values of the
"SourceDimensions" field, i.e., (0, Δy, Δz), thereby
forming a surface sound source. As shown above, when the
dimension and location of a sound source are defined, the
point sound sources are located on the surfaces of the
extended sound source. In the present invention, the
locations of the point sound sources are calculated to be
distributed uniformly on the surfaces of the extended sound
source.
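
One possible way to place point sources uniformly on a surface
source is sketched below. The function names, the grid counts ny
and nz, and the choice of the two in-plane axes are assumptions
made for illustration; the specification states only that the
extension is perpendicular to the direction vector and centered
on the location.

```python
import math


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def surface_points(location, direction, dy, dz, ny, nz):
    """Uniform ny-by-nz grid on a dy-by-dz rectangle centered on
    location and lying in the plane perpendicular to direction."""
    if ny < 1 or nz < 1:
        raise ValueError("ny and nz must be at least 1")
    d = _normalize(direction)
    # Pick a helper axis that is not parallel to d (assumed convention).
    helper = (0.0, 0.0, 1.0) if abs(d[2]) < 0.9 else (1.0, 0.0, 0.0)
    u = _normalize(_cross(helper, d))  # first in-plane axis
    v = _cross(d, u)                   # second in-plane axis
    points = []
    for i in range(ny):
        a = dy * (i / (ny - 1) - 0.5) if ny > 1 else 0.0
        for j in range(nz):
            b = dz * (j / (nz - 1) - 0.5) if nz > 1 else 0.0
            points.append(tuple(location[k] + a * u[k] + b * v[k]
                                for k in range(3)))
    return points
```

For a source at the origin facing along the x axis, every
generated point has an x coordinate of zero, so the grid lies in
the y-z plane as in the (0, Δy, Δz) case.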

Figs. 5A to 5C are diagrams depicting the
distributions of point sound sources based on the shapes of
various sound sources in accordance with the present
invention. The dimension and distance of a sound source
are free variables, so the size of the sound source that
can be recognized by a user can be formed freely.

For example, multi-track audio signals that are
recorded by using an array of microphones can be expressed
by extending point sound sources linearly as shown in Fig.
5A. In this case, the value of the "SourceDimensions"
field is (0, 0, Δz).

Also, different sound signals can be expressed as an
extension of a point sound source to generate a spread
sound source. Figs. 5B and 5C show a surface sound source
expressed through the spread of the point sound source and
a spatial sound source having a volume, respectively. In
the case of Fig. 5B, the value of the "SourceDimensions"
field is (0, Δy, Δz) and, in the case of Fig. 5C, the value
of the "SourceDimensions" field is (Δx, Δy, Δz).

As the dimension of a spatial sound source is defined
as described above, the number of point sound sources
(i.e., the number of input audio channels) determines the
density of the point sound sources in the extended sound
source.
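
The relation between channel count and density can be sketched as
follows. Treating density as the number of point sources per unit
length, area or volume is an assumption made for this example, as
is the function name; the specification does not define a
formula.

```python
def source_density(num_chan, source_dimensions):
    """Point sources per unit length, area or volume.

    num_chan: number of point sources (input audio channels).
    source_dimensions: the SourceDimensions triple; zero components
    are ignored, so a line uses its length, a surface its area and
    a volume its volume. Returns None for an unextended point source.
    """
    extents = [abs(d) for d in source_dimensions if d != 0]
    if not extents:
        return None  # "0,0,0": a single point, density is undefined
    measure = 1.0
    for e in extents:
        measure *= e
    return num_chan / measure
```

Under this assumption, eight channels on a line source of length
4 give two point sources per unit length, and doubling the
channel count doubles the density.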

If an "AudioSource" node is defined in a "source"
field, the value of a "numChan" field may indicate the
number of point sound sources used. The directivity
defined in the "angle," "directivity" and "frequency"
fields of the "DirectiveSound" node can be applied
uniformly to all point sound sources included in the
extended sound source.

The apparatus and method of the present invention can
produce more effective three-dimensional sounds by
extending the spatiality of sound sources of contents.

While the present invention has been described with
respect to certain preferred embodiments, it will be
apparent to those skilled in the art that various changes
and modifications may be made without departing from the
scope of the invention as defined in the following claims.